Quarto Notebook Example

Author

Jim Urick

The source code for this notebook is in this GitHub folder.

Markup Languages

Markdown, \(\LaTeX\) and HTML are markup languages. Notebooks generally (including Jupyter and others) use Markdown so you can write explanations in a more readable format. With Quarto, you can also include \(\LaTeX\) in the Markdown and then render to HTML or PDF.

Markdown, R Markdown, Quarto

Markdown is very easy to learn; find a cheat sheet like this one for basic syntax such as

  • lists
  • bold, italics (also with underscores), code, strikethrough
  • Images: This is the caption

You can render notebooks (with the “Render” button) to a variety of formats, like PDF, HTML, slides or plain Markdown. The header here is set up to render to a standalone HTML file.

R Markdown adds some more functionality to plain Markdown, like code chunks

x <- rnorm(10000)
hist(x, 50)

and inline R code: sin(1)=0.841471. In RStudio, you can create a new code chunk with Ctrl-Alt-I (Cmd-Option-I on macOS) on a blank line.

Quarto is the successor to R Markdown, made by Posit (formerly named RStudio). Older notebook files may have an “.Rmd” extension for R Markdown, while newer “.qmd” files are Quarto.

Quarto adds some fancier things like tabset panels:

x <- rnorm(10000)
hist(x, 50, main = "Normal")

x <- rgamma(10000, shape = 2)
hist(x, 50, main = "Gamma")

LaTeX with MathJax

MathJax is a JavaScript engine that renders \(\LaTeX\) math in HTML output, including pages rendered from R Markdown and Quarto. It doesn’t do everything that \(\LaTeX\) does, but it does enough for most purposes. Find some cheat sheets like this one or this one to learn the basics.

You can include inline math with a single $, like \(e^{\pi i}\), or you can use $$ for “display” math:

\[ \frac{d}{dx}\int_{a}^{x}f(t)\,dt=f(x) \]
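
In the notebook source, those delimiters look roughly like the following. This is a sketch of what the Markdown source might contain, since the rendered page only shows the result; the surrounding sentences are made up for illustration.

```latex
Euler's identity uses inline math: $e^{\pi i} = -1$.

The fundamental theorem of calculus as display math:
$$
\frac{d}{dx}\int_{a}^{x} f(t)\,dt = f(x)
$$
```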

Show multiple lines of work with the align environment.

\[ \begin{align*} \Gamma(n+1) & =\int_0^\infty x^{n}e^{-x}\,dx \\ & =-x^{n}e^{-x}\Bigg|_0^\infty+n\int_0^\infty x^{n-1}e^{-x}\,dx \\ & =n\cdot\Gamma(n) \end{align*} \]
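
The recursion \(\Gamma(n+1)=n\cdot\Gamma(n)\) can be sanity-checked numerically. This is a minimal sketch using base R's integrate() and gamma(); the helper name gamma_integral and the choice n = 5 are my own, just for illustration.

```r
# Hypothetical helper: evaluate the integral definition of Gamma(n + 1) numerically
gamma_integral <- function(n) {
  integrate(function(x) x^n * exp(-x), lower = 0, upper = Inf)$value
}

n <- 5
lhs <- gamma_integral(n)          # Gamma(n + 1) from the integral
rhs <- n * gamma_integral(n - 1)  # n * Gamma(n)

# Both should agree with each other and with base R's gamma(),
# up to numerical integration error
all.equal(lhs, rhs, tolerance = 1e-4)
all.equal(lhs, gamma(n + 1), tolerance = 1e-4)
```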

Use pmatrix to create matrices with parentheses; there are others like bmatrix for square bracket [] matrices. For example, if \(f_n\) is the \(n^{\textsf{th}}\) Fibonacci number, note that

\[ \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} f_{n} \\ f_{n-1} \end{pmatrix} =\begin{pmatrix} f_{n+1} \\ f_{n} \end{pmatrix}. \]

Let \(\phi_{\pm}=\tfrac12(1\pm\sqrt5)\) and diagonalize that matrix to show

\[ \begin{align*} \begin{pmatrix} f_{n+1} \\ f_{n} \end{pmatrix} & =\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^n \begin{pmatrix} 1 \\ 1 \end{pmatrix} \\ & =\frac{1}{\sqrt{5}} \begin{pmatrix} \phi_+ & \phi_- \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \phi_+ & 0 \\ 0 & \phi_- \end{pmatrix}^n \begin{pmatrix} 1 & -\phi_- \\ -1 & \phi_+ \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \end{align*} \]
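
The diagonalization can be checked numerically in base R. This sketch assumes the convention \(f_0=f_1=1\) implied by the starting vector, and writes the matrix as \(PDP^{-1}\) with the eigenvectors \((\phi_\pm, 1)\) as the columns of \(P\); the variable names and the choice n = 10 are my own.

```r
# Eigenvalues of the Fibonacci matrix
phi_plus  <- (1 + sqrt(5)) / 2
phi_minus <- (1 - sqrt(5)) / 2

M <- matrix(c(1, 1,
              1, 0), nrow = 2, byrow = TRUE)

# Columns of P are the eigenvectors (phi, 1); P_inv is the inverse of P
P     <- matrix(c(phi_plus, phi_minus,
                  1,        1), nrow = 2, byrow = TRUE)
P_inv <- matrix(c( 1, -phi_minus,
                  -1,  phi_plus), nrow = 2, byrow = TRUE) / sqrt(5)

n <- 10

# Left side: apply M a total of n times, starting from (1, 1)
v <- c(1, 1)
for (i in seq_len(n)) v <- M %*% v

# Right side: P D^n P^{-1} applied to (1, 1)
D_n <- diag(c(phi_plus^n, phi_minus^n))
w <- P %*% D_n %*% P_inv %*% c(1, 1)

all.equal(v, w)
```

The order of the factors matters: the eigenvector matrix goes on the left and its inverse on the right.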

Macros

You can also use some \(\TeX\) macros like \def\eps{\varepsilon} in case you want a shortcut for \varepsilon=\(\varepsilon\) but don’t want to overwrite \epsilon=\(\epsilon\). However, \def commands like that can be a little odd in a notebook because that code doesn’t render to anything: \(\def\eps{\varepsilon}\).

It’s often better to use JavaScript, like the file “latex.js” in this directory, which is included in the header (see my notes and the MathJax documentation). I used it to define \R as \mathbb{R}, so \R=\(\R\).

R and the tidyverse

If you’re interested in working with data in R, it’s a good idea to read Hadley Wickham’s R for Data Science. The metapackage tidyverse includes most of the packages described there.

library(tidyverse)

Two of the main packages, both used below, are

  • dplyr, for manipulating dataframes, and
  • ggplot2, for plotting.

To get practice, it can help to know that R loads some dataframes automatically like mtcars from the old datasets package.

head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
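
To find other practice dataframes, you can list what the datasets package provides. This is a small sketch using only base R; the variable name available is my own.

```r
# List the datasets bundled with base R's datasets package
available <- data(package = "datasets")$results[, "Item"]
head(available)

# mtcars is one of them; check its size and column types
"mtcars" %in% available
dim(mtcars)
str(mtcars)
```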

Pipes

You should get used to the pipe operator |>. It’s two characters, a vertical line | and greater than >, but the Fira Code font combines them into something like \(\LaTeX\)’s \triangleright (\(\triangleright\)). There’s also an older operator %>% from the magrittr package that you can use in nearly the same way as |>; the native |> was added to base R in version 4.1.

The pipe operator is a simple syntax change: x |> f(y) is equivalent to f(x,y). It allows you to chain together a bunch of function compositions in a way that’s more readable.
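
You can see that the pipe is only a syntax change by quoting both forms without evaluating them. A minimal sketch, assuming R 4.1 or later; x, f and y are placeholder names that never need to exist because nothing is evaluated.

```r
# The parser rewrites the pipe before anything runs,
# so the two quoted expressions are the same call
identical(quote(x |> f(y)), quote(f(x, y)))

# A small chain using only base R functions
piped  <- c(1, 4, 9) |> sqrt() |> sum()
nested <- sum(sqrt(c(1, 4, 9)))
identical(piped, nested)
```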

For example, in the next code chunk, starting from the mtcars dataframe, you

  1. select only the columns mpg and disp,
  2. define the column nonsense by adding those two columns, and
  3. keep just the first 6 rows.

mtcars |>
  select(mpg, disp) |>
  mutate(nonsense = mpg + disp) |>
  head()
                   mpg disp nonsense
Mazda RX4         21.0  160    181.0
Mazda RX4 Wag     21.0  160    181.0
Datsun 710        22.8  108    130.8
Hornet 4 Drive    21.4  258    279.4
Hornet Sportabout 18.7  360    378.7
Valiant           18.1  225    243.1

Without |>, you have to read the function composition from the inside, and the indentation keeps drifting to the right:

head(
  mutate(
    select(mtcars, mpg, disp),
    nonsense = mpg + disp
  )
)
                   mpg disp nonsense
Mazda RX4         21.0  160    181.0
Mazda RX4 Wag     21.0  160    181.0
Datsun 710        22.8  108    130.8
Hornet 4 Drive    21.4  258    279.4
Hornet Sportabout 18.7  360    378.7
Valiant           18.1  225    243.1

The native pipe |> is built into base R (version 4.1 and later); the older %>% is the one from the magrittr package. Either is most often used to chain together functions from dplyr, which was loaded with the library(tidyverse) command.
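
The pipe itself doesn't depend on dplyr. As a rough sketch, here is the same chain written with base R's subset() and transform() in place of select() and mutate(), assuming R 4.1 or later and no packages loaded; the variable name result is my own.

```r
# No library() calls: the native pipe and these functions are all base R
result <- mtcars |>
  subset(select = c(mpg, disp)) |>
  transform(nonsense = mpg + disp) |>
  head()

result
```

The dplyr verbs are still usually the better choice for longer chains, since they were designed to be piped together.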

Plots

The next code chunk shows a simple scatter plot using the ggplot2 package, included in the tidyverse. It just consists of

  • an aesthetic mapping, specifying that the mpg column is the x-axis and disp is the y-axis, and
  • a geom, specifying that records will be displayed as points.

mtcars |>
  ggplot(aes(x = mpg, y = disp)) +
  geom_point()

You can add color with a color aesthetic mapping. In the geom, I also specify that all points have a bigger size and some transparency.

mtcars |>
  ggplot(aes(mpg, disp, color = factor(cyl))) +
  geom_point(size = 3, alpha = 0.6)

Making a plot is generally an iterative process. Maybe this one is good enough.

mtcars |>
  mutate(cyl = factor(cyl)) |>
  ggplot(aes(mpg, disp, color = cyl, group = cyl)) +
  geom_point(size = 3, alpha = 0.6) +
  geom_smooth(method = 'lm', formula = y~x, se = FALSE) +
  theme_bw() +
  theme(legend.position = "inside", legend.position.inside = c(0.85, 0.7)) +
  ggtitle("disp vs mpg, grouped by cyl",
          subtitle = "with linear regression lines")

You can also make plots more interactive with packages like plotly.

library(plotly)

That has a function ggplotly() that translates static ggplot2 objects to plotly plots.

p <-
  mtcars |>
  mutate(cyl = factor(cyl)) |>
  ggplot(aes(mpg, disp, color = cyl, group = cyl)) +
  geom_point(size = 3, alpha = 0.6) +
  geom_smooth(method = 'lm', formula = y~x, se = FALSE) +
  theme_bw() +
  theme(legend.position = "inside", legend.position.inside = c(0.85, 0.7)) +
  ggtitle("disp vs mpg, grouped by cyl",
          subtitle = "with linear regression lines")

ggplotly(p)

As you may notice, ggplotly() doesn’t translate things 100%: the legend above is outside the grid and there’s no subtitle. It generally does pretty well, and you can use the internet to fix the rest.
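
One possible fix, sketched under some assumptions: plotly's layout() function can reposition the legend and rebuild the title after the conversion. The HTML-in-title trick for the subtitle is a common workaround rather than an official subtitle feature, the plot here is a simplified stand-in for the one above, and the legend coordinates are guesses you would tune by eye.

```r
library(ggplot2)
library(plotly)

# Simplified stand-in for the plot above (no regression lines)
p <- ggplot(mtcars, aes(mpg, disp, color = factor(cyl))) +
  geom_point(size = 3, alpha = 0.6) +
  theme_bw()

fig <- ggplotly(p) |>
  plotly::layout(
    # Emulate a subtitle with a line break and smaller text inside the title
    title = list(
      text = "disp vs mpg, grouped by cyl<br><sup>with linear regression lines</sup>"
    ),
    # Legend position as fractions of the plotting area
    legend = list(x = 0.85, y = 0.7)
  )

fig
```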